Rankcluster: An R package for clustering multivariate partial rankings

Authors

  • Julien Jacques
  • Quentin Grimonprez
  • Christophe Biernacki
Abstract

Rankcluster is the first R package to offer both modelling and clustering tools for ranking data, which may be multivariate and partial. Ranking data are modelled by the Insertion Sorting Rank (isr) model, an interpretable model parametrized by a central ranking and a dispersion parameter. A conditional independence assumption makes it possible to handle multivariate rankings, and clustering is performed by means of mixtures of multivariate isr models. The clusters' parameters (central rankings and dispersion parameters) help practitioners interpret the clustering. Moreover, the Rankcluster package provides an estimate of the missing ranking positions when rankings are partial. After an overview of the mixture of multivariate isr models, the Rankcluster package is described and its use is illustrated through the analysis of two real datasets.
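
As a rough sketch of what the abstract describes, a mixture model combined with conditional independence across the ranking variables implies a density of the following general form. The notation is an assumption introduced here for illustration and is not taken from this page: K is the number of clusters, d the number of ranking variables, p_k the mixing proportion of cluster k, and mu_k^j and pi_k^j the central ranking and dispersion parameter of variable j in cluster k. The univariate isr probability p(x^j; mu, pi) is left unspecified.

```latex
% Mixture of multivariate isr models (sketch; notation assumed, not from the source page)
% Within each cluster k, the d ranking variables are conditionally independent,
% so the cluster density factorizes into a product of univariate isr probabilities.
p(\mathbf{x}; \theta)
  = \sum_{k=1}^{K} p_k \prod_{j=1}^{d} p\!\left(x^{j}; \mu_k^{j}, \pi_k^{j}\right),
\qquad p_k \ge 0, \quad \sum_{k=1}^{K} p_k = 1 .
```

Under this reading, each cluster is summarized by d central rankings and d dispersion parameters, which are the quantities the abstract says are used to interpret the clustering.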

Similar resources

Rankcluster: An R Package for clustering multivariate partial ranking

Rankcluster is the first R package dedicated to ranking data. This package proposes modelling and clustering tools for ranking data, potentially multivariate and partial. Ranking data are modelled by the Insertion Sorting Rank (isr) model, which is a meaningful model parametrized by a central ranking and a dispersion parameter. A conditional independence assumption allows to take into account m...

Weighted rank aggregation of cluster validation measures: a Monte Carlo cross-entropy approach

MOTIVATION: Biologists often employ clustering techniques in the exploratory phase of microarray data analysis to discover relevant biological groupings. Given the availability of numerous clustering algorithms in the machine-learning literature, a user might want to select one that performs the best for his/her data set or application. While various validation measures have been proposed over ...

Fitting loglinear Bradley-Terry models (LLBT) for paired comparisons using the R package prefmod

This paper aims at introducing the R package prefmod (Hatzinger, 2009) which allows the user to fit various models to paired comparison data. These models give estimated overall rankings of objects or items where each subject (respondent/judge) makes one or more comparisons between pairs of objects (items). The focus is on the loglinear Bradley-Terry (LLBT) model, the loglinear formulation of t...

The Kendall and Mallows Kernels for Permutations

We show that the widely used Kendall tau correlation coefficient, and the related Mallows kernel, are positive definite kernels for permutations. They offer computationally attractive alternatives to more complex kernels on the symmetric group to learn from rankings, or learn to rank. We show how to extend these kernels to partial rankings, multivariate rankings and uncertain rankings. Examples...

Clustering and Prediction of Rankings Within a Kemeny Distance Framework

Rankings and partial rankings are ubiquitous in data analysis, yet there is relatively little work in the classification community that uses the typical properties of rankings. We review the broader literature that we are aware of, and identify a common building block for both prediction of rankings and clustering of rankings, which is also valid for partial rankings. This building block is the...

Publication date: 2014